Can Neural Networks Recognize Parts?
We demonstrate that neural networks can recognize parts from visual images. The input signals
are gray scale photographs of objects consisting of several parts, and the output signals are
their shapes. After training on a small set of images, the networks become able to recognize
the boundaries between parts without any supervision.
1. Introduction

Visual intelligence (VI)¹ plays a very important role in visual cognition. The retina
receives only two dimensional projections of three dimensional objects. Without any other
information, we have to recognize a three dimensional object from such a projection. Of
course, there are infinitely many interpretations of a given two dimensional image, but we
usually reconstruct a unique three dimensional world, and it is often the proper
interpretation (otherwise, we would be extinct).

VI provides us with the set of rules of interpretation that yield these proper reconstructions
of three dimensional space. Many tasks are solved by VI, for example, reconstruction of
roughness from gray scale images,² recognition of depth from line drawings,³ and determination
of motion from sequences of still images.⁴

One such task is to recognize parts.⁵ When we view an iron dumbbell, we recognize it as two
spheres connected by a rod. Although there are theories⁵ that explain how we divide a
dumbbell into three parts, there are no theories about how we learn the rules these theories
suggest. In this paper, we demonstrate that even a set of simple neural networks can become
able to recognize parts without any supervision if a sufficient number of combinations of
parts is presented, even when no information is given about what each part is. The process
appears to be much easier than one might imagine.

In §2, we define the objects from which we generate visual images. Section 3 describes how to
train neural networks so that they recognize three dimensional shapes and parts from the
visual images. The discussion and conclusions are in §4 and §5, respectively.

Fig. 1. Examples of images used for training

2. Objects used 

In order to make neural networks learn what the parts are, we have to present gray scale
images of three dimensional objects. However, if the objects are too complicated, training
neural networks to learn them is needlessly time consuming. We need simple images that are
two dimensional projections of a set of three dimensional objects and that human beings
easily recognize as a set of parts. As such examples, we employ the images shown in Fig. 1.
If someone asks "What does Fig. 1(a) look like?", the answer may be "Five hemispheres on a
plate with a round hollow". These "hemispheres" and the "hollow" are the parts. It is very
easy for us to recognize these parts. But how did we become able to do this?



3. Training neural networks 

To check how easy it is to learn what the parts are, we train neural networks to recognize
them. The neural networks used are standard three layered perceptrons, where x_i, y_j and z_k
are the values of the input neurons, the neurons in the hidden layer and the output neurons,
respectively, and a and b are the connection coefficients, which are trained by the usual
back propagation procedure.
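As a minimal sketch of how such a three layered perceptron maps inputs to outputs, the forward pass can be written as below. The logistic activation, the absence of bias terms, and the index conventions `a[j][i]` (input to hidden) and `b[k][j]` (hidden to output) are assumptions made for illustration; the text does not specify them. The helper `random_weights` is likewise hypothetical.

```python
import math
import random


def sigmoid(u):
    """Logistic activation (assumed; the text does not name the activation)."""
    return 1.0 / (1.0 + math.exp(-u))


def forward(x, a, b):
    """Forward pass of a three layered perceptron.

    x       : list of input values x_i
    a[j][i] : connection coefficient from input i to hidden neuron j (assumed layout)
    b[k][j] : connection coefficient from hidden neuron j to output k (assumed layout)
    Returns the hidden values y_j and the output values z_k.
    """
    y = [sigmoid(sum(a_ji * x_i for a_ji, x_i in zip(a_j, x))) for a_j in a]
    z = [sigmoid(sum(b_kj * y_j for b_kj, y_j in zip(b_k, y))) for b_k in b]
    return y, z


def random_weights(n_out, n_in, rng, scale=0.1):
    """Hypothetical helper: small random initial coefficients."""
    return [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
```

With 600 inputs, 600 hidden neurons and 6 outputs, `a` would be a 600 x 600 table and `b` a 6 x 600 table; back propagation then adjusts both tables to reduce the output error.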

To determine the values of the inputs x_i, we subdivide an image into a 20 x 30 lattice
(Fig. 2). Each x_i (i = 1, ..., 600) takes the value 1 (0) if the center pixel of its cell is
white (black). In total, 2^6 = 64 images can be considered, because each of the six parts can
take one of two forms.

x = (x_1, x_2, ..., x_600)

Fig. 2. Inputs x_i (i = 1, ..., 600)
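The input encoding and the count of 64 possible images can be sketched as follows. This is an illustration only: the image is assumed to be a list of rows of 0/1 pixels, the lattice is assumed to have 20 rows and 30 columns (the text does not say which dimension is which), and the function names are hypothetical.

```python
from itertools import product


def lattice_inputs(image, rows=20, cols=30):
    """Sample the center pixel of each lattice cell.

    image is a list of pixel rows with 1 for white and 0 for black.
    Returns the 600 inputs x_i in row-major order (ordering is an assumption).
    """
    height, width = len(image), len(image[0])
    cell_h, cell_w = height / rows, width / cols
    x = []
    for r in range(rows):
        for c in range(cols):
            py = int((r + 0.5) * cell_h)  # pixel row at the cell center
            px = int((c + 0.5) * cell_w)  # pixel column at the cell center
            x.append(1 if image[py][px] else 0)
    return x


def all_configurations(n_parts=6):
    """Every assignment of hemisphere (1) or hollow (0) to the six parts."""
    return list(product((0, 1), repeat=n_parts))
```

Calling `len(all_configurations())` gives 64, matching 2^6, and `lattice_inputs` always returns 20 x 30 = 600 values regardless of the pixel resolution of the photograph.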

3.1 Recognition of a hollow or a hemisphere

First, we check whether neural networks can recognize both a hollow and a hemisphere
successfully. We define six outputs z_k (k = 1, ..., 6), where the suffix k corresponds to one
of the six parts: if the kth part is a hemisphere (a round hollow), z_k takes the value 1 (0).
We employ 600 neurons in the hidden layer. Although this may seem too many for such a simple
task, we use the same network later for learning the three dimensional shapes.

In Fig. 3, we show how the average number S of patterns recognized correctly by the trained
neural networks depends on the number n of images used for training. Of course, the z_k take
non-integer values between 0 and 1, but we regard z_k as 1 (0) when z_k > (<) 0.5. Averages
are taken over ten independent training runs for each n. As can be seen, if n is larger than
one third of the total number of images, the neural networks correctly recognize hollows and
hemispheres for all images. Thus, neural networks can recognize a hollow and a hemisphere
correctly.

Fig. 3. Averaged number of correctly recognized patterns out of the total of 64 images as a
function of the number of training images (hollow or hemisphere recognition).

h = (h_1, h_2, ..., h_150)

Fig. 4. Outputs h_i for three dimensional shape recognition.
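The 0.5 threshold and the count S used in this subsection can be sketched as follows. Two points are assumptions, since the text does not state them: an image is counted as correctly recognized only when all six thresholded outputs match, and an output of exactly 0.5 is mapped to 0.

```python
def binarize(z, threshold=0.5):
    """Map each output z_k to 1 if it exceeds the threshold, otherwise 0."""
    return [1 if z_k > threshold else 0 for z_k in z]


def count_correct(outputs, targets):
    """S: number of images whose thresholded outputs all match the targets.

    outputs : list of six-element lists of network outputs, one per image
    targets : list of six-element 0/1 sequences (1 = hemisphere, 0 = hollow)
    """
    return sum(1 for z, t in zip(outputs, targets) if binarize(z) == list(t))
```

Averaging `count_correct` over several independently trained networks for each n would give one point of a curve like the one in Fig. 3.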

3.2 Recognition of 3D shapes

Next, we make neural networks recognize three dimensional shapes. This time, the outputs z_k
are the coarse-grained heights h_k of a hollow or a hemisphere (Fig. 4). We subdivide the
surface of the three dimensional shape into 15 x 10 = 150 lattice cells; h_k takes a value
between 0 and 1. The surface of the flat plate is regarded as having height 0.5, the bottom
of a hollow has height 0.0, and the top of a hemisphere has height 1.0.

Fig. 5. Outputs h_i produced by the trained neural networks. The input is an unknown image
(not used for training).
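A possible construction of the 150 target heights is sketched below. Only the three reference heights (0.5 for the plate, 0.0 for the bottom of a hollow, 1.0 for the top of a hemisphere) come from the text. The placement of the six parts in a 3 x 2 arrangement, the radius, and the spherical profile of the hollow are all assumptions chosen for illustration.

```python
import math


def height_map(parts, cols=15, rows=10, radius=2.5):
    """Coarse-grained heights h_1..h_150 for six parts.

    parts : six 0/1 flags, 1 = hemisphere, 0 = round hollow
    The parts are placed 3 across by 2 down (assumed layout).
    """
    h = [[0.5] * cols for _ in range(rows)]  # flat plate at height 0.5
    for k, flag in enumerate(parts):
        cx = 2 + 5 * (k % 3)   # lattice column of the part center (assumed)
        cy = 2 + 5 * (k // 3)  # lattice row of the part center (assumed)
        sign = 1.0 if flag else -1.0
        for r in range(rows):
            for c in range(cols):
                d2 = (c - cx) ** 2 + (r - cy) ** 2
                if d2 < radius ** 2:
                    # spherical profile rising to 1.0 or sinking to 0.0 at the center
                    h[r][c] = 0.5 + sign * 0.5 * math.sqrt(1.0 - d2 / radius ** 2)
    return [v for row in h for v in row]
```

For example, `height_map((1, 0, 1, 0, 1, 0))` returns 150 values in which the center cell of the first part is 1.0, the center cell of the second part is 0.0, and cells far from any part stay at 0.5.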

In Fig. 5, we show the ability of the trained neural networks. These networks are trained
using 24 of the 64 images. Then an image not used for training is presented. As can be seen,
the neural networks recognize the 3D shape even when an unknown image is presented.

In Fig. 6, we show how the average number S of patterns recognized correctly by the trained
neural networks depends on the number n of images used for training. Neural networks can
learn the 3D shapes if n is larger than 15. Thus, neural networks correctly recognize 3D
shapes.



3.3 Recognition of parts 

Until now, we have not provided any information about what the parts are. Nevertheless, the
neural networks have learned it, as we now show. To see whether the neural networks recognize
parts, we present to them three hollows/hemispheres on a flat plane. If they recognize what
the parts are, they can reproduce the 3D shapes. As shown in Fig. 7, the neural networks can
recognize each hollow/hemisphere as a part. Even if there is only one hollow/hemisphere on a
plate, they can reproduce the three dimensional shape correctly. This means that, without any
supervision, neural networks can recognize each hollow/hemisphere as a part.

Fig. 6. Averaged number of correctly recognized patterns out of the total of 64 images as a
function of the number of training images (3D shape recognition).

Fig. 7. Recognition of parts

4. Discussion 

How do neural networks relate the regions of a gray scale image to the regions on a flat
plate? We did not provide such information at all. However, once the neural networks
recognize the correspondence between parts in images (input information) and parts in 3D
shapes (output information), the task is essentially to find the relation between a 6 bit
input and a 6 bit output (in the bit interpretation, for example, a hemisphere corresponds to
1 and a hollow corresponds to 0). Thus, it is a very easy task for neural networks.

Fig. 8. (a) Averaged number of correctly recognized patterns out of the total of 64 images as
a function of the number of training images (for 6 input/output neurons). (b) The same as (a)
for 3D shape recognition with 50 neurons in the hidden layer.

To check how easy it is, we use 6 neurons in each of the input and output layers and 30
neurons in the hidden layer. The x_i and z_k take the value 0 or 1, and the neural networks
are trained such that x_i = z_k when i = k. As shown in Fig. 8(a), neural networks can do
this. Thus, neural networks need at most 30 neurons in the hidden layer. Hence, if neural
networks recognize parts, the number of hidden neurons can be much smaller than 600.
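The reduced 6 bit task can be sketched as a small back propagation experiment. This is a sketch under stated assumptions, not the authors' setup: the logistic activation, the bias units, the learning rate, the number of epochs, the online (pattern by pattern) updates and the function name `train_identity` are all choices made for illustration.

```python
import math
import random
from itertools import product


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def forward(x, a, b):
    # Hidden values plus a constant bias unit (bias is an assumption).
    y = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in a] + [1.0]
    z = [sigmoid(sum(w * v for w, v in zip(row, y))) for row in b]
    return y, z


def train_identity(n_train=24, n_hidden=30, epochs=200, lr=0.5, seed=0):
    """Train a 6-30-6 perceptron so that z_k reproduces x_k.

    Returns the per-epoch mean squared error on the training patterns and
    the number of the 64 patterns reproduced correctly after training.
    """
    rng = random.Random(seed)
    # Each pattern is 6 bits followed by a constant bias input.
    patterns = [list(p) + [1.0] for p in product((0.0, 1.0), repeat=6)]
    rng.shuffle(patterns)
    train = patterns[:n_train]
    a = [[rng.uniform(-0.5, 0.5) for _ in range(7)] for _ in range(n_hidden)]
    b = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(6)]
    errors = []
    for _ in range(epochs):
        total = 0.0
        for x in train:
            t = x[:6]
            y, z = forward(x, a, b)
            total += sum((zk - tk) ** 2 for zk, tk in zip(z, t))
            # Output and hidden error terms (computed before any update).
            dz = [(zk - tk) * zk * (1.0 - zk) for zk, tk in zip(z, t)]
            dy = [y[j] * (1.0 - y[j]) * sum(dz[k] * b[k][j] for k in range(6))
                  for j in range(n_hidden)]
            for k in range(6):
                for j in range(n_hidden + 1):
                    b[k][j] -= lr * dz[k] * y[j]
            for j in range(n_hidden):
                for i in range(7):
                    a[j][i] -= lr * dy[j] * x[i]
        errors.append(total / len(train))
    correct = sum(
        1 for x in patterns
        if [1.0 if zk > 0.5 else 0.0 for zk in forward(x, a, b)[1]] == x[:6]
    )
    return errors, correct
```

Calling `errors, correct = train_identity(n_train=n)` for several values of n and several seeds, then averaging `correct`, would produce a curve of the same kind as Fig. 8(a); the values obtained depend on the assumed settings.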

In fact, as shown in Fig. 8(b), neural networks achieve the same ability as in Fig. 6 even if
the number of neurons in the hidden layer is only 50. This is close to the number of
hidden-layer neurons in the network whose ability is shown in Fig. 8(a). Thus we can conclude
that neural networks recognize parts well and can thereby drastically reduce the number of
neurons in the hidden layer. This is how they learn what the parts are. If such a simple
network can recognize parts so easily, our neurons can do the same without difficulty. This
may be the reason why we became able to recognize an image as a set of parts. Without
recognition of parts, it is impossible to reduce the number of neurons in the hidden layer.
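To make the size of this reduction concrete, the number of connection coefficients can be compared for the two hidden-layer sizes. The sketch assumes 600 inputs and 150 outputs for the 3D shape networks, as described in §3, and ignores bias terms.

```python
def weight_count(n_in, n_hidden, n_out):
    """Connection coefficients in a three layered perceptron without bias terms."""
    return n_in * n_hidden + n_hidden * n_out


large = weight_count(600, 600, 150)  # 600 hidden neurons
small = weight_count(600, 50, 150)   # 50 hidden neurons
```

Here `large` is 450000 and `small` is 37500, a twelvefold reduction in the number of coefficients to be stored and trained.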

5. Conclusion 

In conclusion, we have shown that neural networks can divide an image into parts
automatically during the training process. This turns out to drastically reduce the number of
neurons used in the hidden layer, i.e., the memory required. This may be the reason why we
became able to recognize an image as a set of parts.